Lab 1 - Operating System Perspective
Task: System Calls
Enter the chapters/software-stack/system-calls/drills/tasks/basic-syscall/ folder.
Run make and then enter the chapters/software-stack/system-calls/drills/tasks/basic-syscall/support/ folder and go through the practice items below.
For debugging, use strace to trace the system calls from your program and make sure the arguments are set right.
- Update the hello.asm and / or hello.s files to print both the Hello, world! and the Bye, world! messages. This means adding another write() system call.
- Update the hello.asm and / or hello.s files to sleep before the exit system call. You need to make the sys_nanosleep() system call, with the timespec structure. Find its ID here.
- Update the hello.asm and / or hello.s files to read a message from standard input and print it to standard output. You'll need to define a buffer in the data or bss section. Use the read system call to read data into the buffer. The return value of read (placed in the rax register) is the number of bytes read. Use that value as the 3rd argument of write, i.e. the number of bytes printed. Find the ID of the read system call here. To find out more about its arguments, see its man page. Standard input descriptor is 0.
- Difficult: Port the initial program to ARM on 64 bits (also called aarch64). Use the skeleton files in the arm/ folder. Find information about the aarch64 system calls here.
- Create your own program, written in assembly, doing some system calls you want to learn more about. Some system calls you could try: open(), rename(), mkdir(). Create a Makefile for that program. Run the resulting program with strace to see the actual system calls being made (and their arguments).
If you're having difficulties solving this exercise, go through this reading material.
Task: System Call Wrappers
Enter the chapters/software-stack/system-calls/syscall-wrapper/drills/tasks/support/ folder and go through the practice items below.
- Update the files in the support/ folder to make the read system call available as a wrapper. Make a call to the read system call to read data from standard input into a buffer. Then call write() to print data from that buffer. Note that the read system call returns the number of bytes read. Use that as the argument to the subsequent write call that prints the read data. We can see that it's easier to have wrapper calls and write most of the code in C than in assembly language.
- Update the files in the support/ folder to make the getpid system call available as a wrapper. Create a function with the signature unsigned int itoa(int n, char *a) that converts an integer to a string. It returns the number of digits in the string. For example, it will convert the number 1234 to the "1234" string (NULL-terminated, 5 bytes long); the return value is 4 (the number of digits of the "1234" string). Then make the call to getpid; it gets no arguments and returns an integer (the PID, i.e. the process ID, of the current process).
If you're having difficulties solving this exercise, go through this reading material.
Task: Library Calls vs System Calls
Enter the chapters/software-stack/system-calls/drills/tasks/libcall-syscall/support/ folder and go through the practice items below.
- Check library calls and system calls for the call2.c file. Use ltrace and strace. Find explanations for the calls being made and the library call to system call mapping.
If you're having difficulties solving this exercise, go through this reading material.
Modern Software Stacks
Most modern computing systems use a software stack such as the one in the figure below:
This modern software stack allows fast development and provides a rich set of applications to the user.
The basic software component is the operating system (OS) (technically the operating system kernel). The OS provides the fundamental primitives to interact with hardware (read and write data) and to manage the running of applications (such as memory allocation, thread creation, scheduling). These primitives form the system call API or system API. An item in the system call API, i.e. the equivalent of a function call that triggers the execution of a functionality in the operating system, is a system call.
The system call API is well-defined, stable and complete: it exposes the entire functionality of the operating system and hardware. However, it is also minimalistic with respect to features, and it provides a low-level (close to hardware) specification, making it cumbersome to use and not portable.
Due to the downsides of the system call API, a basic library, the standard C library (also called libc), is built on top of it. Because the system call API uses an OS-specific calling convention, the standard C library typically wraps each system call into an equivalent function call, following a portable calling convention. More than these wrappers, the standard C library provides its own API that is typically portable. Part of the API exposed by the standard C library is the standard C API, also called ANSI C or ISO C; this API is typically portable across all platforms (operating systems and hardware). This API, going beyond system call wrappers, has several advantages:
- portability: irrespective of the underlying operating system (and system call API), the API is the same
- extensive features: string management, I/O formatting
- possibility of increased efficiency with techniques such as buffering, as we show later
Analyzing the Software Stack
To get a better grasp on how the software stack works, let's take a bottom-up approach: we build and run different programs that start off by using the system call API (the lowest layer in the software stack) and progressively use higher layers.
System Calls Explained
A system call, or syscall for short, is a method used by applications to communicate with the operating system's kernel.
The need for syscalls is tied to the way modern operating systems conceptually separate execution into kernel space and user space.
The kernel space manages the hardware resources such as CPU, I/O devices, disk or memory. Moreover, the kernel also provides an interface for the user space applications to interact with the hardware.
The user space is where applications and processes run. Code running in user space cannot directly access the hardware or perform privileged operations; it needs to use syscalls to do so.
Below, you can see some examples of system calls and what resource they request from the kernel:
- brk() is used to allocate memory
- open() is used to access the file system and open a specific file
- write() is used to access the file system and modify the contents of a specific file
Basic System Calls
The basic-syscall/support/ folder stores the implementation of a simple program in assembly language for the x86_64 (64 bit) architecture.
The program invokes two system calls: write and exit.
The program is duplicated in two files using the two x86 assembly language syntaxes: the Intel / NASM syntax (hello.asm) and the AT&T / GAS syntax (hello.s).
The implementation follows the x86_64 Linux calling convention:
- system call ID is passed in the rax register
- system call arguments are passed, in order, in the rdi, rsi, rdx, r10, r8, r9 registers
Let's build and run the two programs:
student@os:~/.../basic-syscall/support$ ls
hello.asm hello.s Makefile
student@os:~/.../basic-syscall/support$ make
nasm -f elf64 -o hello-nasm.o hello.asm
cc -nostdlib -no-pie -Wl,--entry=main -Wl,--build-id=none hello-nasm.o -o hello-nasm
gcc -c -o hello-gas.o hello.s
cc -nostdlib -no-pie -Wl,--entry=main -Wl,--build-id=none hello-gas.o -o hello-gas
student@os:~/.../basic-syscall/support$ ls
hello.asm hello-gas hello-gas.o hello-nasm hello-nasm.o hello.s Makefile
student@os:~/.../basic-syscall/support$ ./hello-nasm
Hello, world!
student@os:~/.../basic-syscall/support$ ./hello-gas
Hello, world!
The two programs end up printing the Hello, world! message at standard output by issuing the write system call.
Then they complete their work by issuing the exit system call.
The write system call writes a buffer to the file referred to by the first argument, which is the file descriptor.
File descriptors are going to be studied in-depth in future chapters.
For now, it is enough for you to know that they are integers that behave like file handles.
The 3 most common file descriptors are:
- 0 references the standard input (stdin)
- 1 references the standard output (stdout)
- 2 references the standard error (stderr)
Use man 2 write and man 2 _exit to get a detailed understanding of the syntax and use of the two system calls.
You can also check the online man pages: write, exit
We use strace to inspect system calls issued by a program:
student@os:~/.../basic-syscall/support$ strace ./hello-nasm
execve("./hello-nasm", ["./hello-nasm"], 0x7ffc4e175f00 /* 63 vars */) = 0
write(1, "Hello, world!\n", 14Hello, world!
) = 14
exit(0) = ?
+++ exited with 0 +++
There are three system calls captured by strace:
- execve(): this is issued to load and run the program in the new process; you'll find out more about execve in the "Compute" chapter
- write(): called by the program to print Hello, world! to standard output
- exit(): to exit the program
This is the most basic program for doing system calls. Given that system calls require a specific calling convention, their invocation can only be done in assembly language. Obviously, this is not portable (specific to a given CPU architecture, x86_64 in our case) and too verbose and difficult to maintain. For portability and maintainability, we require a higher level language, such as C. In order to use C, we need function wrappers around system calls.
System Call Wrappers
The syscall-wrapper/support/ folder stores the implementation of a simple program written in C (main.c) that calls the write() and exit() functions.
The functions are defined in syscall.s as wrappers around corresponding system calls.
Each function invokes the corresponding system call using the specific system call ID and the arguments provided for the function call.
The implementation of the two wrapper functions in syscall.s is very simple, as the function arguments are passed in the same registers required by the system call.
This is because of the overlap of the first three registers for the x86_64 Linux function calling convention and the x86_64 Linux system call convention.
syscall.h contains the declaration of the two functions and is included in main.c.
This way, C programs can be written that make function calls that end up making system calls.
Let's build, run and trace system calls for the program:
student@os:~/.../syscall-wrapper/support$ ls
main.c Makefile syscall.h syscall.s
student@os:~/.../syscall-wrapper/support$ make
gcc -c -o main.o main.c
nasm -f elf64 -o syscall.o syscall.s
cc -nostdlib -no-pie -Wl,--entry=main -Wl,--build-id=none main.o syscall.o -o main
student@os:~/.../syscall-wrapper/support$ ls
main main.c main.o Makefile syscall.h syscall.o syscall.s
student@os:~/.../syscall-wrapper/support$ ./main
Hello, world!
student@os:~/.../syscall-wrapper/support$ strace ./main
execve("./main", ["./main"], 0x7ffee60fb590 /* 63 vars */) = 0
write(1, "Hello, world!\n", 14Hello, world!
) = 14
exit(0) = ?
+++ exited with 0 +++
The trace is similar to the previous example, showing the write() and exit() system calls.
By creating system call wrappers as C functions, we are now relieved of the burden of writing assembly language code. Of course, there has to be an initial implementation of wrapper functions written in assembly language; but, after that, we can use C only.
Library Calls vs System Calls
The standard C library has primarily two uses:
- wrapping system calls into easier to use C-style library calls, such as open(), write(), read()
- adding common functionality required for our program, such as string management (strcpy()), memory management (malloc()) or formatted I/O (printf())
The first use means a 1-to-1 mapping between library calls and system calls: one library call means one system call. The second group doesn't have a standard mapping. A library call could be mapped to no system calls, one system call, two or more system calls, or it may depend (a system call may or may not happen).
The libcall-syscall/support folder stores the implementation of a simple program that makes different library calls.
Let's build the program and then trace the library calls (with ltrace) and the system calls (with strace):
student@os:~/.../libcall-syscall/support$ make
cc -Wall -c -o call.o call.c
cc call.o -o call
cc -Wall -c -o call2.o call2.c
cc call2.o -o call2
student@os:~/.../libcall-syscall/support$ ltrace ./call
fopen("a.txt", "wt") = 0x556d57679260
strlen("Hello, world!\n") = 14
fwrite("Hello, world!\n", 1, 14, 0x556d57679260) = 14
strlen("Bye, world!\n") = 12
fwrite("Bye, world!\n", 1, 12, 0x556d57679260) = 12
fflush(0x556d57679260) = 0
+++ exited (status 0) +++
student@os:~/.../libcall-syscall/support$ strace ./call
[...]
openat(AT_FDCWD, "a.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fstat(3, {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
write(3, "Hello, world!\nBye, world!\n", 26) = 26
exit_group(0) = ?
+++ exited with 0 +++
We have the following mappings:
- The fopen() library call invokes the openat and the fstat system calls.
- The fwrite() library call invokes no system calls.
- The strlen() library call invokes no system calls.
- The fflush() library call invokes the write system call.
This all seems to make sense.
The main reason for fwrite() not making any system calls is the use of a standard C library buffer.
Calls to fwrite() end up writing to that buffer to reduce the number of system calls.
Actual system calls are made either when the standard C library buffer is full or when an fflush() library call is made.
Note that on some systems, ltrace does not work as expected, due to now binding.
To avoid this behaviour, you can force lazy binding (which ltrace relies on to work).
An example can be found in libcall-syscall/support/Makefile. However, for system binaries, such as ls or pwd, the only alternative is to add the `-x "*"` argument to force the command to trace all symbols in the symbol table:
student@os:~$ ltrace -x "*" ls
You can always choose which library functions ltrace investigates by replacing the wildcard with their names:
student@os:~$ ltrace -x "malloc" -x "free" ls
malloc@libc.so.6(5) = 0x55c42b2b8910
free@libc.so.6(0x55c42b2b8910) = <void>
malloc@libc.so.6(120) = 0x55c42b2b8480
malloc@libc.so.6(12) = 0x55c42b2b8910
malloc@libc.so.6(776) = 0x55c42b2b8930
malloc@libc.so.6(112) = 0x55c42b2b8c40
malloc@libc.so.6(1336) = 0x55c42b2b8cc0
malloc@libc.so.6(216) = 0x55c42b2b9200
malloc@libc.so.6(432) = 0x55c42b2b92e0
malloc@libc.so.6(104) = 0x55c42b2b94a0
malloc@libc.so.6(88) = 0x55c42b2b9510
malloc@libc.so.6(120) = 0x55c42b2b9570
[...]
If you would like to know more about lazy binding, now binding or PLT entries, check out this blog post.